NSF PAR Search | NSF Public Access Repository

CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts

https://doi.org/10.18653/v1/2025.findings-acl.214

Zeng, Qingkai; Bai, Yuyang; Tan, Zhaoxuan; Wu, Zhenyu; Feng, Shangbin; Jiang, Meng (January 2025, Association for Computational Linguistics)

Full Text Available
Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples

https://doi.org/10.1145/3627673.3679608

Zeng, Qingkai; Bai, Yuyang; Tan, Zhaoxuan; Feng, Shangbin; Liang, Zhenwen; Zhang, Zhihan; Jiang, Meng (October 2024, ACM)

Full Text Available
CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts

Zeng, Qingkai; Bai, Yuyang; Tan, Zhaoxuan; Wu, Zhenyu; Feng, Shangbin; Jiang, Meng (August 2024, arXiv)

Full Text Available
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models

Feng, Shangbin; Shi, Weijia; Bai, Yuyang; Balachandran, Vidhisha; He, Tianxing; Tsvetkov, Yulia (May 2024, International Conference on Learning Representations)

By design, large language models (LLMs) are static general-purpose models, expensive to retrain or update frequently. As they are increasingly adopted for knowledge-intensive tasks, it becomes evident that these design choices lead to failures to generate factual, relevant, and up-to-date knowledge. To this end, we propose Knowledge Card, a modular framework to plug in new factual and relevant knowledge into general-purpose LLMs. We first introduce knowledge cards: specialized language models trained on corpora from specific domains and sources. Knowledge cards serve as parametric repositories that are selected at inference time to generate background knowledge for the base LLM. We then propose three content selectors to dynamically select and retain information in documents generated by knowledge cards, specifically controlling for relevance, brevity, and factuality of outputs. Finally, we propose two complementary integration approaches to augment the base LLM with the (relevant, factual) knowledge curated from the specialized LMs. Through extensive experiments, we demonstrate that Knowledge Card achieves state-of-the-art performance on six benchmark datasets. Ultimately, the Knowledge Card framework enables dynamic synthesis and updates of knowledge from diverse domains. Its modularity will ensure that relevant knowledge can be continuously updated through the collective efforts of the research community.
Full Text Available
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models

https://doi.org/10.1145/3589334.3645623

Bai, Yuyang; Feng, Shangbin; Balachandran, Vidhisha; Tan, Zhaoxuan; Lou, Shiqi; He, Tianxing; Tsvetkov, Yulia (May 2024, ACM)

Large language models (LLMs) demonstrate remarkable performance on knowledge-intensive tasks, suggesting that real-world knowledge is encoded in their model parameters. However, besides explorations on a few probing tasks in limited knowledge domains, it is not well understood how to evaluate LLMs' knowledge systematically and how well their knowledge abilities generalize, across a spectrum of knowledge domains and progressively complex task formats. To this end, we propose KGQuiz, a knowledge-intensive benchmark to comprehensively investigate the knowledge generalization abilities of LLMs. KGQuiz is a scalable framework constructed from triplet-based knowledge, which covers three knowledge domains and consists of five tasks with increasing complexity: true-or-false, multiple-choice QA, blank filling, factual editing, and open-ended knowledge generation. To gain a better understanding of LLMs' knowledge abilities and their generalization, we evaluate 10 open-source and black-box LLMs on the KGQuiz benchmark across the five knowledge-intensive tasks and knowledge domains. Extensive experiments demonstrate that LLMs achieve impressive performance in straightforward knowledge QA tasks, while settings and contexts requiring more complex reasoning or employing domain-specific facts still present significant challenges. We envision KGQuiz as a testbed to analyze such nuanced variations in performance across domains and task formats, and ultimately to understand, evaluate, and improve LLMs' knowledge abilities across a wide spectrum of knowledge domains and tasks.
Full Text Available
FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge

https://doi.org/10.18653/v1/2023.emnlp-main.59

Feng, Shangbin; Balachandran, Vidhisha; Bai, Yuyang; Tsvetkov, Yulia (January 2023, Association for Computational Linguistics)

Evaluating the factual consistency of automatically generated summaries is essential for the progress and adoption of reliable summarization systems. Despite recent advances, existing factuality evaluation models are not robust, being especially prone to entity and relation errors in new domains. We propose FactKB, a simple new approach to factuality evaluation that is generalizable across domains, in particular with respect to entities and relations. FactKB is based on language models pretrained using facts extracted from external knowledge bases. We introduce three types of complementary factuality pretraining objectives based on entity-specific facts, facts extracted from auxiliary knowledge about entities, and facts constructed compositionally through knowledge base walks. The resulting factuality evaluation model achieves state-of-the-art performance on two in-domain news summarization benchmarks as well as on three out-of-domain scientific literature datasets. Further analysis shows that FactKB has an improved ability to detect erroneous entities and relations in summaries and is robust and easily generalizable across domains.
Full Text Available